Search CORE

24 research outputs found

Parallel Hybrid Evolutionary Algorithms on GPU

Author: Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 01/01/2010
Field of study

International audienceOver the last years, interest in hybrid metaheuristics has risen considerably in the field of optimization. Combinations of methods such as evolutionary algorithms and local searches have provided very powerful search algorithms. However, due to their complexity, the computational time of the solution search exploration remains exorbitant when large problem instances are to be solved. Therefore, the use of GPU-based parallel computing is required as a complementary way to speed up the search. This paper presents a new methodology to design and implement efficiently and effectively hybrid evolutionary algorithms on GPU accelerators. The methodology enables efficient mappings of the explored search space onto the GPU memory hierarchy. The experimental results show that the approach is very efficient especially for large problem instances

HAL - Lille 3

INRIA a CCSD electronic archive server

GPU Computing for Parallel Local Search Metaheuristics

Author: Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

International audienceLocal search metaheuristics (LSMs) are efficient methods for solving complex problems in science and industry. They allow significantly to reduce the size of the search space to be explored and the search time. Nevertheless, the resolution time remains prohibitive when dealing with large problem instances. Therefore, the use of GPU-based massively parallel computing is a major complementary way to speed up the search. However, GPU computing for LSMs is rarely investigated in the literature. In this paper, we introduce a new guideline for the design and implementation of effective LSMs on GPU. Very efficient approaches are proposed for CPU-GPU data transfer optimization, thread control, mapping of neighboring solutions to GPU threads and memory management. These approaches have been experimented using four well-known combinatorial and continuous optimization problems and four GPU configurations. Compared to a CPU-based execution, accelerations up to x80 are reported for the large combinatorial problems and up to x240 for a continuous problem. Finally, extensive experiments demonstrate the strong potential of GPU-based LSMs compared to cluster or grid-based parallel architectures

HAL - Lille 3

INRIA a CCSD electronic archive server

Multi-start local search algorithms on GPU

Author: Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 01/01/2010
Field of study

International audienceIn practice, combinatorial optimization problems are complex and computationally time-intensive. Even if local search (LS) algorithms allow to significantly reduce the computation time cost of the solution exploration space, the use of parallelism is required to accelerate the search process. Indeed, LSs are inherently parallel and three parallel models are often used to solve efficiently large combinatorial problems: algorithmic-level (multi-start model), iteration-level (parallel evaluation of the neighborhood), and the solution-level (parallel evaluation of a single solution). The main objective of this paper is to deal with the algorithmic-level on GPU architectures where many LSs are executed in parallel. More exactly, we propose to study different schemes of deployment for the design of multi-start LSs on GPU based on popular hill climbing (HC), simulated annealing (SA) and tabu search (TS)

HAL - Lille 3

INRIA a CCSD electronic archive server

Parallel Local Search on GPU

Author: Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 01/01/2009
Field of study

www.lifl.fr/~luongLocal search algorithms are a class of algorithms to solve complex optimization problems in science and industry. Even if these metaheuristics allow to significantly reduce the computational time of the solution exploration space, the iterative process remains costly when very large problem instances are dealt with. As a solution, graphics processing units (GPUs) represent an efficient alternative for calculations instead of traditional CPU. This paper presents a new methodology to design and implement local search algorithms on GPU. Methods such as tabu search, hill climbing or iterated local search present similar concepts that can be parallelized on GPU and then a general cooperative model can be highlighted. In addition to single-solution based metaheuristics on GPU, this model can be extended with a hybrid multi-core and multi-GPU approach for multiple local search methods such as multistart. The conclusions from both GPU and multi-GPU experiments indicate significant speed-ups compared to CPU approaches

HAL - Lille 3

INRIA a CCSD electronic archive server

Algorithmes évolutionnaires parallèles sur GPU

Author: Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 01/01/2010
Field of study

National audienceLes algorithmes d'optimisation tels que les algorithmes évolutionnaires sont des méthodes efficaces pour résoudre des problèmes complexes en sciences et en industrie. Même si ces heuristiques permettent de réduire de manière significative le temps de calcul de l'exploration de l'espace de recherche d'une solution, ce dernier coût reste exorbitant lorsque de très grandes instances d'un problème sont résolues. Ainsi, l'utilisation du calcul parallèle à base de GPU est requise comme une façon complémentaire d'accélérer la recherche. Dans ce papier, on se concentra ainsi sur leur reconception, leur implémentation et les difficultés associées relatifs au contexte d'exécution du GPU. Les résultats expérimentaux obtenus démontrent l'efficacité des approches proposées et leur capacité d'exploiter pleinement l'architecture du GPU

HAL - Lille 3

INRIA a CCSD electronic archive server

A GPU-based Iterated Tabu Search for Solving the Quadratic 3-dimensional Assignment Problem

Author: Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 01/01/2010
Field of study

International audienceThe quadratic 3-dimensional assignment problem (Q3AP) is an extension of the well-known NP-hard quadratic assignment problem. It has been proved to be one of the most difficult combinatorial optimization problems. Local search (LS) algorithms are a class of heuristics which have been successfully applied to solve such hard optimization problem. These methods handle with a single solution iteratively improved by exploring its neighborhood in the solution space. In this paper, we propose an iterated tabu search for solving the Q3AP. The design of this algorithm is essentially based on a new large neighborhood structure. Indeed, in LS heuristics, designing operators to explore large promising regions of the search space may improve the quality of the obtained solutions. However, designing such neighborhood is at the expense of a highly computationally process. Therefore, the use of graphics processing units (GPUs) provides an efficient complementary way to speed up the search. The proposed GPU-based iterated tabu search has been experimented on 5 different Q3AP instances. The obtained results are convincing both in terms of efficiency, quality and robustness of the provided solutions at run time

HAL - Lille 3

INRIA a CCSD electronic archive server

Towards ParadisEO-MO-GPU: a Framework for GPU-based Local Search Metaheuristics

Author: Karima Boufaras
Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2011
Field of study

International audienceThis paper is a major step towards a pioneering software framework for the reusable design and implementation of parallel metaheuristics on Graphics Processing Units (GPU). The objective is to revisit the ParadisEO framework to allow its utilization on GPU accelerators. The focus is on local search metaheuristics and the parallel exploration of their neighborhood. The challenge is to make the GPU as transparent as possible for the user. The first release of the new GPU-based ParadisEO framework has been experimented on the Quadratic Assignment Problem (QAP). The preliminary results are convincing, both in terms of flexibility and easiness of reuse at implementation, and in terms of efficiency at execution on GPU

HAL - Lille 3

INRIA a CCSD electronic archive server

ParadisEO-MO-GPU: a Framework for Parallel GPU-based Local Search Metaheuristics

Author: Boufaras Karima
Luong Thé Van
Melab Nouredine
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 06/07/2013
Field of study

International audienceIn this paper, we propose a pioneering framework called ParadisEO-MO-GPU for the reusable design and implementation of parallel local search metaheuristics (S- Metaheuristics) on Graphics Processing Units (GPU). We revisit the ParadisEO-MO software framework to allow its utilization on GPU accelerators focusing on the parallel iteration-level model, the major parallel model for S- Metaheuristics. It consists in the parallel exploration of the neighborhood of a problem solution. The challenge is on the one hand to rethink the design and implementation of this model optimizing the data transfer between the CPU and the GPU. On the other hand, the objective is to make the GPU as transparent as possible for the user minimizing his or her involvement in its management. In this paper, we propose solutions to this challenge as an extension of the ParadisEO framework. The first release of the new GPU-based ParadisEO framework has been experimented on the permuted perceptron problem. The preliminary results are convincing, both in terms of flexibility and easiness of reuse at implementation, and in terms of efficiency at execution on GPU

HAL - Lille 3

INRIA a CCSD electronic archive server

Métaheuristiques parallèles sur GPU

Author: Van Luong Thé
Publication venue: HAL CCSD
Publication date: 01/12/2011
Field of study

This thesis is written in EnglishReal-world optimization problems are often complex and NP-hard. Their modeling is continuously evolving in terms of constraints and objectives, and their resolution is CPU time-consuming. Although near-optimal algorithms such as metaheuristics (generic heuristics) make it possible to reduce the temporal complexity of their resolution, they fail to tackle large problems satisfactorily. Over the last decades, parallel computing has been revealed as an unavoidable way to deal with large problem instances of difficult optimization problems. The design and implementation of parallel metaheuristics are strongly influenced by the computing platform. Nowadays, GPU computing has recently been revealed effective to deal with time-intensive problems. This new emerging technology is believed to be extremely useful to speed up many complex algorithms. One of the major issues for metaheuristics is to rethink existing parallel models and programming paradigms to allow their deployment on GPU accelerators. Generally speaking, the major issues we have to deal with are: the distribution of data processing between CPU and GPU, the thread synchronization, the optimization of data transfer between the different memories, the memory capacity constraints, etc. The contribution of this thesis is to deal with such issues for the redesign of parallel models of metaheuristics to allow solving of large scale optimization problems on GPU architectures. Our objective is to rethink the existing parallel models and to enable their deployment on GPUs. Thereby, we propose in this document a new generic guideline for building efficient parallel metaheuristics on GPU. Our challenge is to come out with the GPU-based design of the whole hierarchy of parallel models. In this purpose, very efficient approaches are proposed for CPU-GPU data transfer optimization, thread control, mapping of solutions to GPU threads or memory management. These approaches have been exhaustively experimented using five optimization problems and four GPU configurations. Compared to a CPU-based execution, experiments report up to 80-fold acceleration for large combinatorial problems and up to 2000-fold speed-up for a continuous problem. The different works related to this thesis have been accepted in a dozen of publications, including the IEEE Transactions on Computers journal.Les problèmes d'optimisation issus du monde réel sont souvent complexes et NP-difficiles. Leur modélisation est en constante évolution en termes de contraintes et d'objectifs, et leur résolution est coûteuse en temps de calcul. Bien que des algorithmes approchés telles que les métaheuristiques (heuristiques génériques) permettent de réduire la complexité de leur résolution, ces méthodes restent insuffisantes pour traiter des problèmes de grande taille. Au cours des dernières décennies, le calcul parallèle s'est révélé comme un moyen incontournable pour faire face à de grandes instances de problèmes difficiles d'optimisation. La conception et l'implémentation de métaheuristiques parallèles sont ainsi fortement influencées par l'architecture parallèle considérée. De nos jours, le calcul sur GPU s'est récemment révélé efficace pour traiter des problèmes coûteux en temps de calcul. Cette nouvelle technologie émergente est considérée comme extrêmement utile pour accélérer de nombreux algorithmes complexes. Un des enjeux majeurs pour les métaheuristiques est de repenser les modèles existants et les paradigmes de programmation parallèle pour permettre leur déploiement sur les accélérateurs GPU. De manière générale, les problèmes qui se posent sont la répartition des tâches entre le CPU et le GPU, la synchronisation des threads, l'optimisation des transferts de données entre les différentes mémoires, les contraintes de capacité mémoire, etc. La contribution de cette thèse est de faire face à ces problèmes pour la reconception des modèles parallèles des métaheuristiques pour permettre la résolution des problèmes d'optimisation à large échelle sur les architectures GPU. Notre objectif est de repenser les modèles parallèles existants et de permettre leur déploiement sur GPU. Ainsi, nous proposons dans ce document une nouvelle ligne directrice pour la construction de métaheuristiques parallèles efficaces sur GPU. Le défi de cette thèse porte sur la conception de toute la hiérarchie des modèles parallèles sur GPU. Pour cela, des approches très efficaces ont été proposées pour l'optimisation des transferts de données entre le CPU et le GPU, le contrôle de threads, l'association entre les solutions et les threads, ou encore la gestion de la mémoire. Les approches proposées ont été expérimentées de façon exhaustive en utilisant cinq problèmes d'optimisation et quatre configurations GPU. En comparaison avec une exécution sur CPU, les accélérations obtenues vont jusqu'à 80 fois plus vite pour des grands problèmes d'optimisation combinatoire et jusqu'à 2000 fois plus vite pour un problème d'optimisation continue. Les différents travaux liés à cette thèse ont fait l'objet d'une douzaine publications comprenant la revue IEEE Transactions on Computers

HAL - Lille 3

Thèses en Ligne

INRIA a CCSD electronic archive server

Parallel metaheuristics on GPU

Author: Luong Thé Van
Publication venue
Publication date: 01/12/2011
Field of study

Les problèmes d'optimisation issus du monde réel sont souvent complexes et NP-difficiles. Leur modélisation est en constante évolution en termes de contraintes et d'objectifs, et leur résolution est coûteuse en temps de calcul. Bien que des algorithmes approchés telles que les métaheuristiques (heuristiques génériques) permettent de réduire la complexité de leur résolution, ces méthodes restent insuffisantes pour traiter des problèmes de grande taille. Au cours des dernières décennies, le calcul parallèle s'est révélé comme un moyen incontournable pour faire face à de grandes instances de problèmes difficiles d'optimisation. La conception et l'implémentation de métaheuristiques parallèles sont ainsi fortement influencées par l'architecture parallèle considérée. De nos jours, le calcul sur GPU s'est récemment révélé efficace pour traiter des problèmes coûteux en temps de calcul. Cette nouvelle technologie émergente est considérée comme extrêmement utile pour accélérer de nombreux algorithmes complexes. Un des enjeux majeurs pour les métaheuristiques est de repenser les modèles existants et les paradigmes de programmation parallèle pour permettre leurdéploiement sur les accélérateurs GPU. De manière générale, les problèmes qui se posent sont la répartition des tâches entre le CPU et le GPU, la synchronisation des threads, l'optimisation des transferts de données entre les différentes mémoires, les contraintes de capacité mémoire, etc. La contribution de cette thèse est de faire face à ces problèmes pour la reconception des modèles parallèles des métaheuristiques pour permettre la résolution des problèmes d'optimisation à large échelle sur les architectures GPU. Notre objectif est de repenser les modèles parallèles existants et de permettre leur déploiement sur GPU. Ainsi, nous proposons dans ce document une nouvelle ligne directrice pour la construction de métaheuristiques parallèles efficaces sur GPU. Le défi de cette thèse porte sur la conception de toute la hiérarchie des modèles parallèles sur GPU. Pour cela, des approches très efficaces ont été proposées pour l'optimisation des transferts de données entre le CPU et le GPU, le contrôle de threads, l'association entre les solutions et les threads, ou encore la gestion de la mémoire. Les approches proposées ont été expérimentées de façon exhaustive en utilisant cinq problèmes d'optimisation et quatre configurations GPU. En comparaison avec une exécution sur CPU, les accélérations obtenues vont jusqu'à 80 fois plus vite pour des grands problèmes d'optimisation combinatoire et jusqu'à 2000 fois plus vite pour un problème d'optimisation continue. Les différents travaux liés à cette thèse ont fait l'objet d'une douzaine publications comprenant la revue IEEE Transactions on Computers.Real-world optimization problems are often complex and NP-hard. Their modeling is continuously evolving in terms of constraints and objectives, and their resolution is CPU time-consuming. Although near-optimal algorithms such as metaheuristics (generic heuristics) make it possible to reduce the temporal complexity of their resolution, they fail to tackle large problems satisfactorily. Over the last decades, parallel computing has been revealed as an unavoidable way to deal with large problem instances of difficult optimization problems. The design and implementation of parallel metaheuristics are strongly influenced by the computing platform. Nowadays, GPU computing has recently been revealed effective to deal with time-intensive problems. This new emerging technology is believed to be extremely useful to speed up many complex algorithms. One of the major issues for metaheuristics is to rethink existing parallel models and programming paradigms to allow their deployment on GPU accelerators. Generally speaking, the major issues we have to deal with are: the distribution of data processing between CPU and GPU, the thread synchronization, the optimization of data transfer between the different memories, the memory capacity constraints, etc. The contribution of this thesis is to deal with such issues for the redesign of parallel models of metaheuristics to allow solving of large scale optimization problems on GPU architectures. Our objective is to rethink the existing parallel models and to enable their deployment on GPUs. Thereby, we propose in this document a new generic guideline for building efficient parallel metaheuristics on GPU. Our challenge is to come out with the GPU-based design of the whole hierarchy of parallel models.In this purpose, very efficient approaches are proposed for CPU-GPU data transfer optimization, thread control, mapping of solutions to GPU threadsor memory management. These approaches have been exhaustively experimented using five optimization problems and four GPU configurations. Compared to a CPU-based execution, experiments report up to 80-fold acceleration for large combinatorial problems and up to 2000-fold speed-up for a continuous problem. The different works related to this thesis have been accepted in a dozen of publications, including the IEEE Transactions on Computers journal

Theses.fr